
Question 6.2

In the BLAST search the acceleration is twofold: besides the fast index lookup, a second good index hit

must be present nearby before the exact alignment is started.
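The two-hit idea can be illustrated with a minimal sketch. It is a simplification under stated assumptions: words are matched exactly (the real protein BLAST also accepts similar "neighbourhood" words above a score threshold), and the function names `word_hits` and `two_hit_seeds`, the word length `w=3` and the window of 40 positions are chosen here for illustration only.

```python
from collections import defaultdict


def word_hits(query, subject, w=3):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    # Index every overlapping word of the subject sequence.
    index = defaultdict(list)
    for j in range(len(subject) - w + 1):
        index[subject[j:j + w]].append(j)
    # Look up every overlapping word of the query in that index.
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


def two_hit_seeds(hits, w=3, window=40):
    """Keep only hits preceded by a non-overlapping hit on the same diagonal within the window."""
    last_on_diagonal = {}  # diagonal (subject_pos - query_pos) -> query_pos of last accepted hit
    seeds = []
    for i, j in sorted(hits):
        diagonal = j - i
        previous = last_on_diagonal.get(diagonal)
        if previous is not None and i - previous < w:
            continue  # overlaps the previous hit, so it does not count as a second hit
        if previous is not None and i - previous <= window:
            seeds.append((i, j))  # second hit nearby: worth starting the exact alignment here
        last_on_diagonal[diagonal] = i
    return seeds


hits = word_hits("MKTAYIAKQR", "GGMKTAYIAKQRGG")
print(two_hit_seeds(hits))
```

A single isolated word match yields no seed in this sketch, so the expensive alignment step is started far less often than with a one-hit rule.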

As a reminder, here is the tutorial on how to find a sequence:

https://blast.ncbi.nlm.nih.gov/blastcgihelp.shtml

For comparison, here is a FASTA server, which needs only one hit:

https://fasta.bioch.virginia.edu/fasta_www2/

For an unknown sequence, it can make sense to try both options, since the two servers

can produce different results depending on the sequence. However, the BLAST server is faster.

Finally, the hits found can also be combined into an alignment that is used for a further search (PSI-BLAST):

https://www.ncbi.nlm.nih.gov/books/NBK2590/

https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE=Proteins&PROGRAM=blastp&RUN_PSIBLAST=on
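How an alignment of hits can drive a further search may be sketched as a position-specific score table. This is only an illustration of the principle, not the PSI-BLAST implementation: the function names `column_profile` and `profile_score`, the uniform background frequency and the pseudocount of 1 are assumptions made here, and the three short sequences are invented example data.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_profile(aligned, pseudocount=1.0):
    """Build log-odds scores per alignment column from equally long, gap-free sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequencies
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def profile_score(profile, seq):
    """Sum the column scores for a sequence of the same length as the profile."""
    return sum(column[aa] for column, aa in zip(profile, seq))


profile = column_profile(["MKT", "MKS", "MRT"])  # invented hits, already aligned
print(profile_score(profile, "MKT"), profile_score(profile, "AAA"))
```

Residues that are conserved among the hits receive positive scores, so a sequence resembling the whole family scores higher than it would against any single hit.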

Question 6.3

(a) A further acceleration of the sequence comparison is, for example, the BLAT search:

https://genome.ucsc.edu/FAQ/FAQblat.html

(b) The tutorial also explains the advantages: BLAT is even faster than the BLAST

search, and its index search covers a whole genome. Disadvantage: less “depth”, so

distant similarities are not detected as reliably.
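The index search over a whole genome can be sketched as follows. BLAT indexes non-overlapping words of the genome and scans the query with overlapping words; everything else here is an assumption for illustration: the function names `build_genome_index` and `lookup`, the toy word length `k=4` and the short example "genome".

```python
from collections import defaultdict


def build_genome_index(genome, k=4):
    """Index non-overlapping k-mers of the genome: k-mer -> list of start positions."""
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):  # step k: tiles do not overlap
        index[genome[pos:pos + k]].append(pos)
    return index


def lookup(query, index, k=4):
    """Scan the query with overlapping k-mers and return (query_pos, genome_pos) hits."""
    hits = []
    for i in range(len(query) - k + 1):
        for pos in index.get(query[i:i + k], []):
            hits.append((i, pos))
    return hits


genome = "ACGTTGCAAGGCTTAACCGG"  # invented toy sequence
index = build_genome_index(genome)
print(lookup("GCAAGGCTT", index))
```

Because only every k-th genome position is stored, the index stays small enough to cover a whole genome, but a match now requires a perfect k-mer, which is one reason why distant similarities are found less reliably.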

Question 6.4

An analogy: Nothing travels faster than the speed of light. That’s why you have to be prepared

for long waiting times (years!) when travelling to the stars. Therefore, the fastest

way is not to go at all, but to think!

All our sequence comparisons try to find out which protein is present, i.e. what its

annotation (bioinformatic functional description) or function is. We have just learned

about examples: BLAST, PSI-BLAST, FASTA, and other BLAST variants. All these searches are

heuristic, i.e. fast, but not quite exact. There are also exact searches: global sequence

comparison using the Needleman-Wunsch algorithm and local sequence comparison

using the Smith-Waterman algorithm. Further possibilities are searches in domain databases

like SMART and ProDom, protein family databases like Pfam, and finally specialized searches

like the BLOCKS similarity search. But (see above): the fastest way is to use the correct

annotation. Where is the best place to find it? Investigate this right away in task 6.5.
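The difference between the two exact methods can be seen in a minimal dynamic-programming sketch. It computes only the optimal score, not the alignment itself, and uses a simple linear gap penalty; the function name `align_score` and the scoring values (match 1, mismatch -1, gap -2) are assumptions chosen for illustration, not the parameters of any particular server.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Optimal alignment score: global (Needleman-Wunsch) or, with local=True, local (Smith-Waterman)."""
    # First row: leading gaps cost nothing in the local variant.
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(diagonal, prev[j] + gap, cur[j - 1] + gap)
            if local:
                score = max(score, 0)  # a local alignment may restart anywhere
                best = max(best, score)
            cur.append(score)
        prev = cur
    # Global: score of aligning both sequences end to end. Local: best cell anywhere.
    return best if local else prev[-1]


print(align_score("XXGATTACAYY", "GATTACA"))              # global
print(align_score("XXGATTACAYY", "GATTACA", local=True))  # local
```

Every cell of the table is filled, so the running time grows with the product of the two sequence lengths. That is the price of exactness which the heuristic index searches avoid.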

Question 6.5

• Annotation in GenBank is a very good standard annotation (detailed description of the

properties of the gene or protein or RNA molecule). Here, however, the annotation is

filed by the author after checking and proofreading by NCBI. In this respect, there are

differences in the depth or detail of the annotation. This is particularly evident in the

20.6  Extremely Fast Sequence Comparisons Identify all the Molecules that Are Present…